Pre-Infection Innate Immunity Attenuates SARS-CoV-2 Infection and Viral Load in iPSC-Derived Alveolar Epithelial Type 2 Cells

A large portion of the heterogeneity in coronavirus disease 2019 (COVID-19) susceptibility and severity of illness (SOI) remains poorly understood. Recent evidence suggests that SARS-CoV-2 infection-associated damage to alveolar epithelial type 2 cells (AT2s) in the distal lung may directly contribute to disease severity and poor prognosis in COVID-19 patients. Our in vitro modeling of SARS-CoV-2 infection in induced pluripotent stem cell (iPSC)-derived AT2s from 10 different individuals showed interindividual variability in infection susceptibility and the postinfection cellular viral load. To understand the underlying mechanism of the AT2′s capacity to regulate SARS-CoV-2 infection and cellular viral load, a genome-wide differential gene expression analysis between the mock and SARS-CoV-2 infection-challenged AT2s was performed. The 1393 genes, which were significantly (one-way ANOVA FDR-corrected p ≤ 0.05; FC abs ≥ 2.0) differentially expressed (DE), suggest significant upregulation of viral infection-related cellular innate immune response pathways (p-value ≤ 0.05; activation z-score ≥ 3.5), and significant downregulation of the cholesterol- and xenobiotic-related metabolic pathways (p-value ≤ 0.05; activation z-score ≤ −3.5). Whilst the effect of post-SARS-CoV-2 infection response on the infection susceptibility and postinfection viral load in AT2s is not clear, interestingly, pre-infection (mock-challenged) expression of 238 DE genes showed a high correlation with the postinfection SARS-CoV-2 viral load (FDR-corrected p-value ≤ 0.05 and r2-absolute ≥ 0.57). The 85 genes whose expression was negatively correlated with the viral load showed significant enrichment in viral recognition and cytokine-mediated innate immune GO biological processes (p-value range: 4.65 × 10−10 to 2.24 × 10−6). The 153 genes whose expression was positively correlated with the viral load showed significant enrichment in cholesterol homeostasis, extracellular matrix, and MAPK/ERK pathway-related GO biological processes (p-value range: 5.06 × 10−5 to 6.53 × 10−4). Overall, our results strongly suggest that AT2s’ pre-infection innate immunity and metabolic state affect their susceptibility to SARS-CoV-2 infection and viral load.


Introduction
Coronavirus disease 2019 (COVID- 19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), manifests through a broad spectrum of symptoms and symptom severity; many who were infected remained asymptomatic (20-40%), and for those who developed symptoms, the disease ranged from mild cold-like symptoms to severe illness and even death [1,2].This heterogeneity in disease susceptibility and severity of illness (SOI) has been common for all SARS-CoV-2 variants.The major risk factors for severe disease have been age, sex, and the presence of comorbidities such as cardiovascular Cells 2024, 13, 369 2 of 19 disease, diabetes, hypertension, and obesity.The geographical location, social vulnerabilities, race, and ethnic differences were also suggested to influence the disease [3].However, most of the variation in disease susceptibility and SOI remains poorly understood.
SARS-CoV-2 first infects and replicates in the ACE2-expressing cells of the upper respiratory tract and then spreads to the alveolar epithelial cells in the distal lung [4,5].The viral load in the nasopharynx usually peaks within the first week of infection and was reported to be similar in both asymptomatic and symptomatic patients [6,7].However, a faster viral clearance was reported in asymptomatic individuals than in those who were symptomatic [8].Studies of the host response to SARS-CoV-2 infection in the upper respiratory tract using in vitro cell models and patients' nasopharyngeal swabs have shown that SARS-CoV-2 instigates a strong innate immune response via pathogen recognition receptor (PRR)-activated expression of the interferons (IFNs) and IFN-stimulated genes (ISGs), albeit with delayed kinetics, and PRR/IFN response failed to curtail viral replication in some cases [9][10][11].Autoantibodies against type I IFNs in the blood of some patients before SARS-CoV-2 infection have been associated with severe COVID-19 [12].Furthermore, in human in vitro cell models of SARS-CoV-2 infection, pretreatment with exogenous IFNs or a rhinovirus (RV) infection-induced IFN/ISG response blocked SARS-CoV-2 infection.However, exogenous IFNs added post-SARS-CoV-2 infection were significantly less effective in stopping the viral replication, and SARS-CoV-2 proteins inhibited various steps in endogenous IFN production and response [9,11,13,14].Nevertheless, the spread of SARS-CoV-2 infection in the distal lung and associated damage to the alveolar epithelial cells in severe COVID-19 cases cause acute respiratory distress syndrome (ARDS), the predominant cause of COVID-19-related deaths.An imaging mass cytometry analysis of COVID-19 patients' postmortem lung tissue showed that SARS-CoV-2 predominantly infects alveolar epithelial cells in the lung, and damage to alveolar epithelial type 2 cells (AT2s) was positively correlated with clinical factors such as organizing pneumonia, the number of days that elapsed since symptom onset, hospitalization and intubation, and lung weight at death [15], suggesting a critical role of these cells in COVID-19 pathology.The AT2s constitute 10-15% of all alveolar cells and are critical for alveolar homeostasis and preventing alveolar collapse in a number of ways, including (1) the production of surfactant protein (S-protein; SP); (2) stabilization of the airway epithelial barrier; (3) immune defense; and (4) regeneration of distal lung epithelial cells through their progenitor function [16][17][18].Lineage-tracing models have demonstrated that AT2s proliferate and differentiate into alveolar epithelial type 1 cells (AT1s), both in lung injury and in normal unstressed repair [16,19].Damage to or dysregulated AT2 function has been implicated in pulmonary fibrosis [20].AT2s express ACE2 and transmembrane serine protease 2 (TMPRSS2), both of which the virus uses for entry into the cell [4,21].Consistent with the above line of evidence, COVID-19 patients who develop ARDS show a high degree of tissue damage and dysregulated tissue repair with a high level of fibrosis compared to non-COVID-19-associated ARDS and lung injuries [15].Furthermore, a high level of pulmonary fibrosis is one of the most severe complications that causes permanent lung damage in COVID-19 patients, plausibly directly implicating AT2s' dysregulated function [22].Despite significant improvement in our understanding of COVID-19 lung pathology and the cellular landscape involved, in the last few years, factors influencing the disease's susceptibility and SOI remain largely unidentified.
The heritability estimates of the predicted COVID-19 phenotypic variance in the Twin-sUK cohort were reported at 31%, which suggests a significant contribution of host genetic factors to the disease variability [23].However, like other complex diseases, genome-wide association study (GWAS)-identified COVID-19 risk loci/variants (reviewed in Cappadona et al., 2023;[24]) explain a substantially small portion of the COVID-19 phenotypic variance, with only about 6.5% of the COVID-19 symptom heritability explained by common variants [25].Therefore, alternative approaches are necessary to identify the genetic determinants of COVID-19 susceptibility and SOI.
In this study, we performed a transcriptome-wide analysis of the cellular response to SARS-CoV-2 infection challenge in AT2s derived from the induced pluripotent stem cells (iPSCs) of 10 individuals of our longitudinal Mexican American Family Study (MAFS) to understand the mechanisms of the AT2 ′ s capacity to regulate SARS-CoV-2 infection and cellular viral load.

Materials and Methods
Validated iPSC lines reprogrammed from the lymphoblastoid cell lines of 10 MAFS participants using a methodology we have published previously [26,27] were used for the AT2 generation.

SARS-CoV-2 Infection Assay
AT2 monolayer cultures were maintained in AT2 medium (described above) until 60 to 70% confluent and then transferred to the University of Texas Rio Grande Valley (UTRGV) Biosafety Level 3 (BSL3) containment laboratory.All SARS-CoV-2 viral procedures were performed in compliance with containment procedures in the BSL3 as approved by UTRGV's Institutional Biosafety Committee and the Department of Environmental Health, Safety, and Risk Management, under an approved protocol (2019-003-HBA).For the SARS-CoV-2 infection assay, the SARS-CoV-2 USA-WA1/2020 viral strain was obtained from the World Reference Center for Emerging Viruses and Arboviruses at the University of Texas Medical Branch, Galveston, TX (a generous gift from Dr. Kenneth Plante), and a working stock of 6-7 log 10 of the virus was generated following the method described in Harcourt et al. (2020) [31].For each sample, infections were performed in triplicate with a multiplicity of infection (MOI) = 10, and the other three replicates were mock-infected (treated with medium without virus).After 2 h of viral inoculation, viral and mock media were replaced with fresh AT2 medium, and cells were returned to the cell culture incubator for 48 h.After 48 h, one replicate of each of the mock and SARS-CoV-2 infection-challenged AT2s was fixed using 4% paraformaldehyde solution (Millipore Sigma, St. Louis, MO, USA), and the other two replicates of each of the mock and SARS-CoV-2 infection-challenged cells were harvested by digestion with RNA extraction buffer.The tubes containing cell digest for RNA extraction and the plates of fixed AT2s were sealed, surface decontaminated, and transferred to the BSL2 laboratory for further analysis.

RNA Extraction, RT-PCR, and RNA Sequencing
Total RNA was extracted from each of the mock and SARS-CoV-2 infection-challenged AT2s (2 replicates each of the mock-and SARS-CoV-2-infected cells per sample) using a commercially available RNeasy Mini Kit (Qiagen, Germantown, MD, USA) and the manufacturer's protocol.RNA quality and quantity were assessed using a NanoDrop 2000 Spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) and an Agilent 2200 TapeStation system (Agilent, Santa Clara, CA, USA).
Relative quantification of the viral load in the SARS-CoV-2-infected replicates 2 and 3 (R2 and R3) was performed by RT-PCR analysis using primers and probe sequences specific to the SARS-CoV-2 nucleocapsid protein gene, as described in the US CDC rRT-PCR panel N1 assay in Lu et al. (2020) [32].The primers and probe were synthesized and obtained commercially (Eurofins Genomics Company, Louisville, KY, USA).The measured viral RNA quantities were normalized by the expression of the human housekeeping gene GAPDH as an endogenous control (Applied Biosystems, ThermoFisher Scientific, Waltham, MA, USA).The RT-PCR was performed using TaqPath™ 1-Step multiplex master mix and on a QuantStudio™ 3 Real-Time PCR instrument (ThermoFisher Scientific, Waltham, MA, USA).
RNA sequencing of both mock and SARS-CoV-2 infection-challenged replicates (2 replicates each per sample) was performed using the Illumina Stranded mRNA Prep Ligation Kit on an Illumina NovaSeq6000 instrument.Briefly, 1µg of high-quality total RNA per sample per replicate and the reagents supplied in Illumina Stranded mRNA sample preparation kit v2 (Illumina, Inc., San Diego, CA, USA) were used to prepare mRNA sequencing libraries.Oligo-dT magnetic beads supplied in the kit were used to first enrich the poly-A tail containing mRNA molecules from the total RNA.Divalent cations and elevated temperature were then used to fragment the enriched mRNA into ~200-600 base pair-sized molecules.The first-strand cDNA was synthesized from cleaved RNA fragments using reverse transcriptase and random primers, followed by second-strand cDNA synthesis using DNA polymerase-I and RNase H.The synthesized cDNA fragments were end-repaired, and adaptors were ligated.The resulting cDNA libraries were purified, enriched by PCR, and then deep sequenced on an Illumina NovaSeq6000 instrument.

RNA Sequencing Analyses
The Illumina bcl2fastq2 software v2.20.0 (Illumina, Inc., San Diego, CA, USA) was used to generate and demultiplex raw fastq sequence files.After pre-alignment QCs, sequences were aligned to human genome assembly GRCh38 (hg38) and mapped to RefSeq transcripts using StrandNGS software v4.1 (Strand Life Sciences Pvt. Ltd., Bangalore, India).The aligned reads were filtered based on default read quality metrics, and log transformation and "DESeq" normalization were applied.Known genes/mRNAs with a normalized read count (NRC) ≥ 10 and conditions as described in the results were considered for various differential gene expression analyses.

Differential Gene Expression Analyses
To identify genes that were significantly differentially expressed (DE) between various conditions described in the results, moderated t statistics or one-way ANOVA and expression fold change (FC) analyses were performed.The genes with moderated t statistics/oneway ANOVA FDR-corrected p-value ≤ 0.05 and FC absolute (FC abs) ≥ 2.0 between the pair(s) of conditions were considered significantly differentially expressed.

Functional Annotations and Enrichment Analyses
Disease/functional annotations and enrichment analyses of the gene sets of interest, identified by the differential gene expression analyses detailed in the results, were performed using the Ingenuity Pathway Analysis (IPA) platform (QIAGEN Digital Insights, Redwood City, CA, USA) and 'Enrichr' and 'ShinyGO v0.77' gene set enrichment analysis web tools [33,34] (accessed November 2023).In IPA, right-tailed Fisher's exact test FDR-corrected p-values were used for enrichment significance, and the direction of functional change was assessed by the activation z-score detailed in Kramer et al. (2014) [35].Briefly, in IPA activation z-score is used to infer activation states of an enriched biological function.The basis of this inference is the relationship between genes and biological function(s) that are literature-derived.The direction of effect (up-or downregulated) is determined by DE genes in the data set and the direction of the gene's effect on the biological function.A positive z-score indicates upregulation, and a negative z-score indicates downregulation of the biological function.The 'Enrichr' implements several enrichment scores as described in Chen et al. (2013) [33]; we ranked our 'Enrichr' results based on computed Fisher exact test p-values.In 'ShinyGO' enrichment analyses, FDR is calculated based on nominal p-value from the hypergeometric test and FDR-corrected p-values ≤ 0.05 were used for statistical significance.

AT2 Generation
Existing iPSC lines from 10 individuals of our longitudinal MAFS were differentiated into AT2s using a four-stage differentiation protocol summarized in Figure 1a, which uses a sequential modulation of the signaling pathways by small molecules and biologicals detailed in the methods.All generated AT2 lines showed high expression of AT2-specific markers NKX2-1, EpCAM, SFTPB, and SFTPC measured by ICC both in 3D alveolosphere and monolayer AT2 cultures (Figure 1b-d).An HCS analysis of the ICC-stained AT2 markers, which quantified five to eight thousand cells across nine randomly chosen visual fields per AT2 line showed that >90% of the generated cells across all 10 AT2 lines expressed NKX2-1, a key lung epithelial cell transcription factor; other AT2-specific markers EpCAM, SFTPB, and SFTPC were expressed in ~99 to 100% of generated cells (Figure 1c,d).These results suggest a highly uniform AT2 differentiation across the 10 samples.
In a differential gene expression analysis between iPSCs' and AT2s' (control replicates R2 and R3) expressed transcriptomes (16,038 genes that were expressed (NRC ≥ 10) in iPSCs and/or differentiated AT2 lines), a total of 7422 genes were significantly differentially expressed (moderated t statistics FDR-corrected p-value ≤ 0.05 and FC abs ≥ 2.0) and accounted for ~40% of the iPSC's and AT2 ′ s expressed transcriptome (Figure 2a).

Individual-Specific Variation in AT2s' Susceptibility to SARS-CoV-2 Infection and Viral Load
Quantitative measures of the AT2s' susceptibility to SARS-CoV-2 infection and the postinfection viral load are shown in Figure 3a-d.The replicates R1 of each AT2 line, both after the mock and SARS-CoV-2 infection challenge, were analyzed by ICC labeling of the SARS-CoV-2 nucleocapsid protein (NP) and HCS quantification of the NP-positive (SARS-CoV-2-infected) cells as a surrogate measure of AT2s' susceptibility to SARS-CoV-2 infection (Figure 3a,b).The NP's ICC fluorescence was quantified in the cell cytoplasm, and the cells that passed a set NP ICC fluorescence threshold across all AT2 lines were counted as SARS-CoV-2-infected, as detailed in Figure 3c(i,ii).The other two replicates (R2 and R3) were quantified for NP gene expression as a measure of cellular viral load after the mock and SARS-CoV-2 infection challenge using RT-PCR.Both HCS-based quantitative measures of SARS-CoV-2-infected cells and the RT-PCR-based measures of the viral load showed high variation across the AT2 lines of 10 individuals, suggesting a significantly high individual-specific differential response to SARS-CoV-2 infection challenge (Figure 3b,d).Additionally, quantitative measures of infection susceptibility and viral load were highly reproducible (similar) between replicates (r 2 = 0.938 between R1 and R2; 0.908 between R1 and R3; and 0.983 between R2 and R3), suggesting that the donor-specific variation observed in SARS-CoV-2 infection-induced cellular response was likely due to biological factors.The expression of ACE2 was significantly downregulated (FC = 2.64) postinfection (Figure 3e) and showed a high correlation (r 2 ~0.70) with the viral load measures.

AT2s' Transcriptomic Response Was Highly Correlated with SARS-CoV-2 Infection and Viral Load Measures
To investigate the transcriptome-wide AT2 cellular response to SARS-CoV-2 infection, 15,640 genes with an NRC ≥ 10 in at least 50% of the replicates were considered expressed.A pairwise differential gene expression analysis between mock and SARS-CoV-2 infectionchallenged replicates across 10 AT2 lines (two replicates per condition per line) using these expressed genes identified 1393 genes that were significantly (one-way FDRcorrected p ≤ 0.05; FC abs ≥ 2.0) DE between mock and SARS-CoV-2 infection-challenged AT2s (Table S1).The SARS-CoV-2 infection-challenged expression change in 712 and 645 DE genes showed >25% correlation (r 2 absolute > 0.25) with the cellular measures of SARS-CoV-2-infection and viral load, respectively.The top and bottom 20 genes whose expression was either significantly upregulated or downregulated (highest average FC abs across 10 samples) suggest upregulation of the innate immune response and downregulation of metabolic processes in SARS-CoV-2 infection-challenged AT2s (Figure 4).lines).(e) Change in ACE2 gene expression between mock and SARS-CoV-2 infection-challenged AT2s.

AT2s' Transcriptomic Response Was Highly Correlated with SARS-CoV-2 Infection and Viral Load Measures
To investigate the transcriptome-wide AT2 cellular response to SARS-CoV-2 infection, 15,640 genes with an NRC ≥ 10 in at least 50% of the replicates were considered expressed.A pairwise differential gene expression analysis between mock and SARS-CoV-2 infection-challenged replicates across 10 AT2 lines (two replicates per condition per line) using these expressed genes identified 1393 genes that were significantly (one-way ANOVA FDR-corrected p ≤ 0.05; FC abs ≥ 2.0) DE between mock and SARS-CoV-2 infection-challenged AT2s (Table S1).The SARS-CoV-2 infection-challenged expression change in 712 and 645 DE genes showed >25% correlation (r 2 absolute > 0.25) with the cellular measures of SARS-CoV-2-infection and viral load, respectively.The top and bottom 20 genes whose expression was either significantly upregulated or downregulated (highest average FC abs across 10 samples) suggest upregulation of the innate immune response and downregulation of metabolic processes in SARS-CoV-2 infection-challenged AT2s (Figure 4).For a more comprehensive characterization of the AT2 transcriptomic response, functional annotation and enrichment analyses of all the 1393 DE genes in gene ontology (GO) biological processes, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and IPA canonical pathways were performed, which further validates that SARS-CoV-2 infection induced a strong innate immune response in AT2s and viral infection-related cellular innate immune response pathways were significantly enriched (p-value ≤ 0.05) and highly upregulated (z-score ≥ 3.5), whilst the cholesterol-and xenobiotic-related metabolism pathways were significantly downregulated (Figure 5a-c).Similarly, the repertoire of upstream regulators that were significantly enriched and predicted to be differentially regulated in SARS-CoV-2 infection-challenged AT2s (p-value ≤ 0.01; activation z-score-absolute ≥ 6) also showed significant upregulation of the interferon regulatory factors (IRFs: IRF1, IRF3, and IRF7) and their downstream interferons (IFNs) and IFN-stimulated genes (ISGs).These results strongly suggest that AT2s mount a strong primary antiviral innate immune response against SARS-CoV-2 infection, which plausibly plays a role in clearing the viral infection.The AT2 ′ s innate immune response to SARS-CoV-2 infection across 10 lines was directly proportional to the measures of SARS-CoV-2 infection susceptibility and postinfection cellular viral load.However, it is not clear if post-SARS-CoV-2 infection response played a causal role in the interindividual variability observed in the infection susceptibility and/or viral load (Figure 5c,d).

Discussion
COVID-19 phenotypic variance has a significant genetic basis; however, factors that explain disease risk beyond age, sex, and cardiovascular and metabolic comorbidities are yet to be fully identified [24].In part, this is due to the complex etiology of the COVID-19 disease and difficulty in sampling certain disease states, less severe and nonsymptomatic cases in particular.Nonetheless, ARDS remained the predominant cause of COVID-19related deaths, and AT2s have emerged as one of the primary cell types affected [15,[39][40][41].In this study, we investigated AT2s' capacity to regulate SARS-CoV-2 infection and cellular viral load as surrogate measures of COVID-19 phenotypic variability using iPSC-derived AT2s from 10 different individuals and a transcriptome-wide approach.Our comprehensive transcriptomic and functional annotation analysis of the generated AT2s (Figures 1 and 2) and previously published studies validate that the human iPSC-derived AT2s possess a transcriptomic and functional profile that is very similar to human primary AT2s and therefore are the close surrogates and the relevant cells to model SARS-CoV-2 infection-induced lung damage [29,42,43].To the best of our knowledge, our SARS-CoV-2 infection challenge of the generated 10 AT2 lines in in vitro cultures for the first time shows individual-specific variability in the AT2s' susceptibility to SARS-CoV-2 infection and postinfection cellular viral load, which is highly reproducible across replicates (Figure 3).This observed variability in the iPSC-derived AT2s' susceptibility to SARS-CoV-2 infection and cellular viral load plausibly recapitulates the COVID-19 phenotypic variability in disease susceptibility and SOI.Given that iPSC-derived cells are minimally influenced by life-experienced environmental exposures, it is highly likely that the observed interindividual variation in AT2 response to SARS-CoV-2 infection is driven by genetic factors [44][45][46].SARS-CoV-2 infection induced a strong primary immune response in AT2s, RNA virus sensing PRRs DDX58 (also known as RNA sensor RIG-I or RIGI) and IFIH1 (also known as melanoma differentiation-associated gene 5 or MDA-5), PRR-induced IFN regulatory factors (IRF1, IRF3, and IRF7), and their downstream IFNs (type I: INF-α/β; type II: IFN-γ; and type III: IFN-λ), and ISGs were upregulated or predicted to be upregulated (activation z-score ≥ 2.0) in SARS-CoV-2 infection-challenged AT2s (Table S1 and Figure 5c,d).Contrary to the findings in a previous report [47], our data strongly suggest induction of all three IFN classes (type I, II, and III) in the AT2s' SARS-CoV-2 infection response, albeit with high interindividual heterogeneity, the IFNs/ISGs activation was directly proportional to the measures of cellular viral load and percentage of infected cells (Figure 5c,d).However, transcriptomic expression of the IFN genes (IFNB1, IFNL1, IFNL2, and IFNL3) was detected only in SARS-CoV-2-infected AT2 lines, which had a higher viral load and did not pass the threshold set for the expressed gene in our differential gene expression analysis.Interindividual heterogeneity in SARS-CoV-2 infection response in peripheral blood mononuclear cells and viral load-associated tropism to IFN response in postmortem lung tissue has also been reported in recent studies [48,49].
Whilst the effect of post-SARS-CoV-2 infection innate immune response on the infection susceptibility and postinfection viral load in AT2s is not clear, AT2s' pre-infection innate immunity showed a significant inverse relationship with SARS-CoV-2 infection susceptibility and postinfection viral load.The pre-infection expression of 85 genes that were also DE post-SARS-CoV-2 infection was significantly negatively correlated with postinfection viral load and infection susceptibility in the 10 AT2 lines.The significant enrichment of these genes in viral recognition and cytokine-mediated innate immune GO terms (Table 1) strongly suggest that pre-infection interindividual variability in the innate immune processes is causally associated with SARS-CoV-2 infection susceptibility and postinfection viral load and plausibly plays a role in COVID-19 susceptibility and SOI.This antagonism between pre-infection innate immunity and COVID-19 susceptibility and SOI has also been suggested in children's upper airway epithelial cells, macrophages, and den-dritic cells [10].Furthermore, heterologous IFNs/ISGs response to prior RV infections has been reported to suppress SARS-CoV-2 replication throughout the epithelium in in vitro experiments [9].Pre-activated innate immunity likely preempts IFN/ISG induction before SARS-CoV-2-expressed proteins antagonize and delay such response.
Interestingly, the pre-infection expression of 153 genes, which were also DE post-SARS-CoV-2 infection, correlated significantly positively with post-SARS-CoV-2 infection viral load and infection susceptibility (Table S2).The significant enrichment of these genes in cholesterol homeostasis, ECM, and MAPK/ERK pathway-related GO terms (Table 1) suggest that these metabolic processes may have an agonistic role in SARS-CoV-2 viral infection and replication, while a plethora of evidence, including our own findings in AT2s, shows that, postinfection, SARS-CoV-2 inhibits host cholesterol metabolism (reviewed in [50,51]).Increasing evidence also supports the view that the cholesterol-and sphingolipids-rich lipid rafts play a vital role in SARS-CoV-2 infection.Intriguingly, the SARS-CoV-2 spike (S) protein contains putative cholesterol recognition amino acid consensus motifs, which can directly interact with cholesterol and an antibody, blocking the putative cholesterol binding site of the SARS-CoV-2 S protein and strongly inhibiting SARS-2-S pseudovirus infection [52].Furthermore, two independent CRISPR screens showed that cholesterol metabolism genes including the genes involved in sterol metabolism are essential for SARS-CoV-2 infection [53,54].Also, the human lung Calu-3 cells treated with high cholesterol developed a higher SARS-CoV-2 viral load [55].A pronounced remodeling of ECM was reported in COVID-19 cases [56,57].However, the role of ECM in SARS-CoV-2 infection susceptibility and viral replication is enigmatic.The ADAMTS proteins may play a role in post-SARS-CoV-2 infection fibrosis and regulation of the TGF-β pathway [58].However, their role in infection susceptibility and/or postinfection viral load is not clear.A higher protein level of MMP10 in the serum of nonsymptomatic SARS-CoV-2-infected individuals was reported in one study [59].However, MMP10 gene expression was positively correlated with SARS-CoV-2 viral load and showed a significant downregulation in post-SARS-CoV-2 infection in our study.As stated in the results, the pre-infection expression of the TMPRSS6 gene showed the highest positive correlation with SARS-CoV-2 viral load in 10 AT2 lines.The TMPRSS6 is a liver serine protease that negatively regulates hepcidin [60][61][62][63].A higher level of hepcidin suggests a pro-inflammatory state [64,65], which is likely less conducive to SARS-CoV-2 infection and/or viral replication.Alternatively, a lower hepcidin level may promote viral infection and replication.What significance these connections might hold needs further study.Pre-infection positive regulation of MAPK-ERK1/2 pathways showed a positive correlation with AT2s' susceptibility to SARS-CoV-2 infection and postinfection viral load.A recent study showed that treatment of Caco-2 cells with SARS-CoV-2 spike protein's receptor binding domain activated the key mediators (ERK1/2 and CRAF) of MAPK signaling, and inhibition of MAPK signaling by MEK inhibitors reduced SARS-CoV-2 infection [66].

Conclusions
Overall, our study demonstrates that human iPSC-derived AT2s are transcriptionally and functionally very similar to human primary AT2s and therefore are close surrogates to investigate interindividual variability in SARS-CoV-2-induced lung damage.Our results for the first time show donor-specific variability in iPSC-derived AT2s' susceptibility to SARS-CoV-2 infection and postinfection viral load.This observed variability in the iPSCderived AT2s' susceptibility to SARS-CoV-2 infection and viral load plausibly recapitulates the COVID-19 phenotypic variability observed in patients.We report a comprehensive list of 1393 genes that constituted AT2s' response to SARS-CoV-2 infection and show significant induction of AT2s' innate immunity pathways and downregulation of cholesterol and xenobiotic metabolic pathways.More interestingly, our results show that AT2s' pre-infection innate immunity and metabolic state significantly affected their susceptibility to SARS-CoV-2 infection and viral load.There are a few important caveats to our study.We focused only on AT2s' cellular response to SARS-CoV-2 infection, a cell type primarily affected in SARS-CoV-2 distal lung infection.The SARS-CoV-2 infection challenge was performed in AT2 in vitro cultures.However, other cell types including the likely recruitment of immune cells in the infection response may confound the AT2 response in COVID-19 patients.The iPSC-derived cells are minimally influenced by life-experienced environmental exposures; it is highly likely that the observed interindividual variation in AT2 response to SARS-CoV-2 infection is driven by genetic/biological factors but lacks environmental influence.Lastly, we did not have COVID-19 data on individuals whose samples were used for iPSC derivation.

Informed Consent Statement:
The MAFS participants whose validated iPSC lines were used in this study were originally recruited in our San Antonio Family Heart Study (SAFHS) and provided appropriate written consent.

Figure 1 .
Figure 1.AT2 generation and characterization.(a) A schematic outline of the AT2 differentiation protocol.(b) Morphological and ICC analysis of the generated AT2s' 3D alveolospheres showing expression of AT2-specific markers NKX2-1, EpCAM, SFTPB, and SFTPC.(c) (i,ii) Representative enlarged images detailing HCS quantification methodology of the nuclear and cytoplasmic expression of AT2 markers: (i) quantification of NKX2-1 nuclear expression, (ii) expression of EpCAM, SFTPB, and SFTPC markers measured in the cytoplasmic area marked by the white line minus the nuclear area.(d) Representative ICC images of each AT2 marker expression in monolayer culture; expression of each marker was quantitatively measured, as detailed in 'c', and cells passing a fluorescence threshold, set uniformly across all samples, were counted positive for the respective marker and identified as red/green highlighted nuclei.(e) Bar graphs showing the percentage of cells positive for each AT2 marker across 10 generated AT2 lines.

Figure 1 .
Figure 1.AT2 generation and characterization.(a) A schematic outline of the AT2 differentiation protocol.(b) Morphological and ICC analysis of the generated AT2s' 3D alveolospheres showing expression of AT2-specific markers NKX2-1, EpCAM, SFTPB, and SFTPC.(c) (i,ii) Representative enlarged images detailing HCS quantification methodology of the nuclear and cytoplasmic expression of AT2 markers: (i) quantification of NKX2-1 nuclear expression, (ii) expression of EpCAM, SFTPB, and SFTPC markers measured in the cytoplasmic area marked by the white line minus the nuclear area.(d) Representative ICC images of each AT2 marker expression in monolayer culture; expression of each marker was quantitatively measured, as detailed in 'c', and cells passing a fluorescence threshold, set uniformly across all samples, were counted positive for the respective marker and identified as red/green highlighted nuclei.(e) Bar graphs showing the percentage of cells positive for each AT2 marker across 10 generated AT2 lines.

Figure 2 .
Figure 2. Transcriptomic characterization of the iPSC generated AT2 lines.(a) Heat map of genes that were significantly differentially expressed between iPSCs and differentiated AT2 lines.(b) Gene expression violin plots of AT2-specific genes in iPSCs and differentiated 10 AT2 lines showing significant upregulation of AT2-specific genes in all generated AT2 lines.(c) Gene set enrichment analysis plots of AT2s' upregulated transcriptome in Human Gene Atlas, Azimuth Cell Types, and COVID-19-related gene sets; * enrichment significance p-values.(d) Correlation coefficient (r 2 ) plot based on the 13,285 genes found expressed (NRC ≥ 10) in the generated AT2s.(e) Gene expression plot of ACE2 and TMPRSS2 in the generated AT2s.

Figure 2 .
Figure 2. Transcriptomic characterization of the iPSC generated AT2 lines.(a) Heat map of genes that were significantly differentially expressed between iPSCs and differentiated AT2 lines.(b) Gene expression violin plots of AT2-specific genes in iPSCs and differentiated 10 AT2 lines showing significant upregulation of AT2-specific genes in all generated AT2 lines.(c) Gene set enrichment analysis plots of AT2s' upregulated transcriptome in Human Gene Atlas, Azimuth Cell Types, and COVID-19-related gene sets; * enrichment significance p-values.(d) Correlation coefficient (r 2 ) plot based on the 13,285 genes found expressed (NRC ≥ 10) in the generated AT2s.(e) Gene expression plot of ACE2 and TMPRSS2 in the generated AT2s.

Figure 3 .
Figure 3. Interindividual variability in AT2s' susceptibility to SARS-CoV-2 infection and viral load.(a) Post SARS-CoV-2 infection challenge ICC-based HCS quantification of SARS-CoV-2 nucleocapsid protein-positive AT2s (a total of 9 randomly chosen visual fields per AT2 line were analyzed for the quantitative measures of SARS-CoV-2-positive cells).(b) Comparative bar graph of percent of SARS-CoV-2-positive cells across 10 AT2 samples quantified by HCS analysis of the SARS-CoV-2infected replicate R1.(c) Representative enlarged images detailing HCS quantification methodology; (i) SARS-CoV-2 nucleocapsid protein-associated ICC fluorescence was measured in the cell cytoplasm of each cell marked by the white cell boundary minus nuclear area; (ii) cells passing a fluorescence threshold, set uniformly across all samples, were counted positive for SARS-CoV-2 infection and are represented by green nuclei in the image.(d) Comparative bar graph of RT-PCR-based relative quantification of the SARS-CoV-2 nucleocapsid protein gene expression (viral load) across 10 AT2 samples (measurements were performed on SARS-CoV-2-infected replicates R2 and R3 of all 10 AT2

Figure 3 .
Figure 3. Interindividual variability in AT2s' susceptibility to SARS-CoV-2 infection and viral load.(a) Post SARS-CoV-2 infection challenge ICC-based HCS quantification of SARS-CoV-2 nucleocapsid protein-positive AT2s (a total of 9 randomly chosen visual fields per AT2 line were analyzed for the quantitative measures of SARS-CoV-2-positive cells).(b) Comparative bar graph of percent of SARS-CoV-2-positive cells across 10 AT2 samples quantified by HCS analysis of the SARS-CoV-2-infected replicate R1.(c) Representative enlarged images detailing HCS quantification methodology; (i) SARS-CoV-2 nucleocapsid protein-associated ICC fluorescence was measured in the cell cytoplasm of each cell marked by the white cell boundary minus nuclear area; (ii) cells passing a fluorescence threshold, set uniformly across all samples, were counted positive for SARS-CoV-2 infection and are represented by green nuclei in the image.(d) Comparative bar graph of RT-PCR-based relative quantification of the SARS-CoV-2 nucleocapsid protein gene expression (viral load) across 10 AT2 samples (measurements were performed on SARS-CoV-2-infected replicates R2 and R3 of all 10 AT2 lines).(e) Change in ACE2 gene expression between mock and SARS-CoV-2 infection-challenged AT2s.

Figure 4 .
Figure 4. AT2s' transcriptomic response to SARS-CoV-2 infection: expression heat map of the top and bottom 20 genes that were significantly up-or downregulated between mock and SARS-CoV-2 infection-challenged AT2s.

Figure 5 .
Figure 5. AT2s' cellular response to SARS-CoV-2 infection.(a) Top 20 GO biological processes that were significantly enriched in 1393 DE genes.(b) Top 20 KEGG pathways that were significantly enriched in 1393 DE genes.(c) Activation z-score heatmap of top 20 IPA canonical pathways, which were significantly enriched and differentially regulated post-SARS-CoV-2 infection in 10 AT2 lines; * enrichment significance p-values.(d) Activation z-score heatmap of the upstream regulators that were predicted to be significantly up-or downregulated based on the 1393 DE genes.

Figure 5 .
Figure 5. AT2s' cellular response to SARS-CoV-2 infection.(a) Top 20 GO biological processes that were significantly enriched in 1393 DE genes.(b) Top 20 KEGG pathways that were significantly enriched in 1393 DE genes.(c) Activation z-score heatmap of top 20 IPA canonical pathways, which were significantly enriched and differentially regulated post-SARS-CoV-2 infection in 10 AT2 lines; * enrichment significance p-values.(d) Activation z-score heatmap of the upstream regulators that were predicted to be significantly up-or downregulated based on the 1393 DE genes.
E.C., S.W.-B.and J.B.; supervision and project administration, S.W.-B.and J.B.; funding acquisition, S.K., J.E.C., S.W.-B.and J.B.All authors have read and agreed to the published version of the manuscript.Funding: Data collection of the MAFS participants originally recruited in the San Antonio Family Heart Study (SAFHS) was supported by the National Institutes of Health (NIH) Grant P01 HL045522.Part of this work was supported by a philanthropic grant from the Valley Baptist Legacy Foundation to STDOI's THRIVE Center for Regenerative Medicine laboratory (510000000) and NIH Grants U54 HG013247, U19 AG076581 and RM1 GM149403.This work was conducted in part in facilities constructed under the support of NIH grant C06 RR020547.Institutional Review Board Statement: The study protocols were approved by the Institutional Review Board of the University of Texas Rio Grande Valley, Edinburg (IRB-18-0245, June 2015, last renewed on 7 August 2023).

Table 1 .
GO biological processes (terms) that were significantly enriched in the 238 genes whose expression was highly correlated with post-SARS-CoV-2 infection viral load in AT2 lines.